
    Republishing OpenStreetMap’s roads as linked routable tiles

    Route planning providers manually integrate different geospatial datasets before offering a Web service to developers, thus creating a closed world view. In contrast, combining open datasets at runtime can provide more information for user-specific route planning needs. For example, an extra dataset of bike-sharing availability may provide more relevant information to the occasional cyclist. A strategy for automating the adoption of open geospatial datasets is needed to allow an ecosystem of route planners able to answer more specific and complex queries. This raises new challenges, such as (i) how open geospatial datasets should be published on the Web to increase interoperability, and (ii) how route planners can discover and integrate relevant data for a certain query on the fly. We republished OpenStreetMap's road network as "Routable Tiles" to facilitate its integration into open route planners. To achieve this, we use a Linked Data strategy and follow an approach similar to vector tiles. In a demo, we show how client-side code can automatically discover tiles and execute a shortest-path algorithm over them. We provide four contributions: (i) we launched an open geospatial dataset that is available for everyone to reuse at no cost, (ii) we published a Linked Data version of the OpenStreetMap ontology, (iii) we introduced a hypermedia specification for vector tiles that extends the Hydra ontology, and (iv) we released the mapping scripts, demo, and routing scripts as open-source software.
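    To make the tile-based approach concrete, the sketch below shows how a client could map WGS84 coordinates to a slippy-map tile, fetch that tile, and run a shortest-path search over whatever has been loaded. The tile host, URL template, and payload shape are illustrative assumptions; the real Routable Tiles interface is a JSON-LD hypermedia API with a richer vocabulary.

```typescript
// Hypothetical tile payload: a flat edge list. The real Routable
// Tiles vocabulary describes OSM nodes and ways in JSON-LD.
interface Edge { to: string; weight: number }
type Graph = Map<string, Edge[]>;

// Convert WGS84 coordinates to slippy-map tile indices at a zoom level.
function tileIndex(lat: number, lon: number, zoom = 14): { x: number; y: number } {
  const n = 2 ** zoom;
  const latRad = (lat * Math.PI) / 180;
  return {
    x: Math.floor(((lon + 180) / 360) * n),
    y: Math.floor(((1 - Math.asinh(Math.tan(latRad)) / Math.PI) / 2) * n),
  };
}

// Fetch one tile (hypothetical host) and merge its edges into the graph.
async function loadTile(graph: Graph, x: number, y: number): Promise<void> {
  const res = await fetch(`https://tiles.example.org/14/${x}/${y}`);
  const tile = (await res.json()) as { edges: { from: string; to: string; weight: number }[] };
  for (const e of tile.edges) {
    const out = graph.get(e.from) ?? [];
    out.push({ to: e.to, weight: e.weight });
    graph.set(e.from, out);
  }
}

// Plain Dijkstra over the tiles loaded so far; returns the path cost.
function shortestPath(graph: Graph, source: string, target: string): number {
  const dist = new Map<string, number>([[source, 0]]);
  const queue: [number, string][] = [[0, source]];
  while (queue.length > 0) {
    queue.sort((a, b) => a[0] - b[0]); // simple stand-in for a priority queue
    const [d, node] = queue.shift()!;
    if (node === target) return d;
    if (d > (dist.get(node) ?? Infinity)) continue; // stale queue entry
    for (const e of graph.get(node) ?? []) {
      const nd = d + e.weight;
      if (nd < (dist.get(e.to) ?? Infinity)) {
        dist.set(e.to, nd);
        queue.push([nd, e.to]);
      }
    }
  }
  return Infinity; // target not reachable from the loaded tiles
}
```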

    A file-based linked data fragments approach to prefix search

    Text fields that need to look up specific entities in a dataset can be equipped with autocompletion functionality. When a dataset becomes too large to be embedded in the page, setting up a full-text search API is not the only alternative. Alternative API designs that balance different trade-offs, such as archivability, cacheability, and privacy, may not require setting up a new back-end architecture. In this paper, we propose performing prefix search over a fragmentation of the dataset, enabling the client to take part in the query execution by navigating through the fragmented dataset. Our proposal consists of (i) a self-describing fragmentation strategy, (ii) a client-side search algorithm, and (iii) an evaluation of the proposed solution, based on a small dataset of 73k entities and a large dataset of 3.87M entities. We found that the server cache hit ratio is three times higher compared to a server-side prefix search API, at the cost of higher bandwidth consumption. Nevertheless, we measured an acceptable user-perceived performance: assuming 150 ms as an acceptable waiting time between keystrokes, this approach allows 15 entities per prefix to be retrieved within this interval. We conclude that an alternative set of trade-offs has been established for specific prefix search use cases: by adding more choice to the spectrum of Web APIs for autocompletion, a file-based approach enables more datasets to afford prefix search.
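    A minimal sketch of the client side of such a design, assuming a hypothetical layout where each normalized two-character prefix maps to one static JSON file (the paper's fragmentation is self-describing via hypermedia controls rather than hard-coded like this):

```typescript
// Hypothetical fragment shape: a flat list of entities per prefix file.
interface Fragment { entities: { id: string; label: string }[] }

const BASE = "https://data.example.org/fragments"; // assumed host

// Map a query to the fragment file that covers it: at most the first
// two characters, lower-cased, name the file.
function fragmentUrl(query: string): string {
  const key = query.toLowerCase().slice(0, 2);
  return `${BASE}/${encodeURIComponent(key)}.json`;
}

// Fetch the covering fragment once, then filter client-side.
async function prefixSearch(query: string, limit = 15): Promise<string[]> {
  const res = await fetch(fragmentUrl(query));
  if (!res.ok) return [];
  const fragment = (await res.json()) as Fragment;
  const q = query.toLowerCase();
  return fragment.entities
    .filter((e) => e.label.toLowerCase().startsWith(q))
    .map((e) => e.label)
    .slice(0, limit);
}
```

    Because every query that shares the same leading characters resolves to the same file, repeated requests can be answered by the server's (or any intermediary's) HTTP cache, which is where the higher cache hit ratio comes from.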

    Knowledge Graphs Evolution and Preservation -- A Technical Report from ISWS 2019

    One of the grand challenges discussed during the Dagstuhl Seminar "Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web" and described in its report is that of a "Public FAIR Knowledge Graph of Everything: We increasingly see the creation of knowledge graphs that capture information about the entirety of a class of entities. [...] This grand challenge extends this further by asking if we can create a knowledge graph of 'everything', ranging from common-sense concepts to location-based entities. This knowledge graph should be 'open to the public' in a FAIR manner, democratizing this mass amount of knowledge." Although Linked Open Data (LOD) is only one knowledge graph, it is the closest realisation (and probably the only one) of a public FAIR Knowledge Graph (KG) of everything. Surely, LOD provides a unique testbed for experimenting with and evaluating research hypotheses on open and FAIR KGs. One of the most neglected FAIR issues regarding KGs is their ongoing evolution and long-term preservation. We want to investigate this problem: that is, to understand what preserving and supporting the evolution of KGs means and how these problems can be addressed. Clearly, the problem can be approached from different perspectives and may require the development of different approaches, including new theories, ontologies, metrics, strategies, procedures, etc. This document reports on a collaborative effort performed by nine teams of students, each guided by a senior researcher as their mentor, attending the International Semantic Web Research School (ISWS 2019). Each team provides a different perspective on the problem of knowledge graph evolution, substantiated by a set of research questions as the main subject of their investigation. In addition, they provide their working definitions for KG preservation and evolution.

    How to prototype a client-side route planner for Helsinki with routable tiles and linked connections

    Route planning is key in application domains such as delivery services, tourism advice, and ride sharing. Today's route-planning-as-a-service solutions do not cover all requirements of each use case, forcing application developers to build their own self-hosted route planners. This quickly becomes expensive to develop and maintain, especially when it requires integrating data from different sources. We demo a configurable route planner that takes advantage of strategically designed data publishing approaches and performs data integration and query execution on the client. For this demonstrator, we (i) publish a Linked Connections interface for the public transit data in Helsinki, including live updates; (ii) integrate Routable Tiles, a tiled Linked Data version of the OpenStreetMap road network; and (iii) implement a graphical user interface, on top of the Planner.js SDK we have built, to display the query results. By moving the data integration to the client, we provide higher flexibility for application developers to customize their solutions according to their needs. While querying may be slow today, these preliminary results already hint at data publishing strategies that could increase query evaluation performance on the client side.
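    For illustration only, a client-side query against an SDK like Planner.js might look as follows; the import, method, and option names are assumptions for this sketch, not the SDK's documented API.

```typescript
// Illustrative sketch; names below are assumed, not Planner.js's
// documented API.
import Planner from "plannerjs"; // assumed package/export name

const planner = new Planner();

planner
  .query({
    from: "Kamppi, Helsinki",          // assumed option names
    to: "Helsinki-Vantaa Airport",
    minimumDepartureTime: new Date(),
  })
  .on("data", (path: unknown) => {
    // Each result is assembled on the client by integrating Linked
    // Connections pages (transit) with Routable Tiles (road network).
    console.log(path);
  })
  .on("end", () => console.log("no more routes"));
```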

    Publishing public transport data on the Web with the Linked Connections framework

    Publishing transport data on the Web for consumption by others poses several challenges for data publishers. In addition to planned schedules, access to live schedule updates (e.g. delays or cancellations) and historical data is fundamental to enable reliable applications and to support machine learning use cases. However, publishing such dynamic data further increases the computational burden for data publishers, resulting in often unavailable historical data and live schedule updates for most public transport networks. In this paper we apply and extend the current Linked Connections approach for static data to also support cost-efficient live and historical public transport data publishing on the Web. Our contributions include: (i) a reference specification and system architecture to support cost-efficient publishing of dynamic public transport schedules and historical data; (ii) empirical evaluations of route planning query performance based on data fragmentation size, publishing costs, and a comparison with a traditional route planning engine, OpenTripPlanner; and (iii) an analysis of potential correlations of query performance with particular public transport network characteristics such as size, average degree, density, clustering coefficient, and average connection duration. Results confirm that fragmentation size influences route planning query performance, converging on an optimal fragment size per network. Size (stops), density, and connection duration also show correlation with route planning query performance. Our approach proves to be more cost-efficient and in some cases outperforms OpenTripPlanner when supporting the earliest-arrival-time route planning use case. Moreover, the cost of publishing live and historical schedules remains in the same order of magnitude for server-side resources compared to publishing planned schedules only. Yet, further optimizations are needed for larger networks (>1000 stops) to be useful in practice. Additional dataset fragmentation strategies (e.g. geospatial) may be studied for designing more scalable and performant Web APIs that adapt to particular use cases, not limited only to the public transport domain.
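    The earliest-arrival use case evaluated here can be served by the Connection Scan Algorithm over the paged, departure-time-ordered connections. Below is a simplified sketch: the page shape stands in for the actual JSON-LD with hydra:next links, and transfer times and footpaths are omitted for brevity.

```typescript
// Simplified Linked Connections page; a real feed is JSON-LD and the
// `nextPage` field stands in for a hydra:next link.
interface Connection {
  departureStop: string;
  arrivalStop: string;
  departureTime: string; // ISO 8601
  arrivalTime: string;   // ISO 8601
}
interface Page { connections: Connection[]; nextPage?: string }

// Earliest-arrival Connection Scan: walk pages in departure-time order,
// relaxing each connection that is reachable in time.
async function earliestArrival(
  firstPageUrl: string,
  from: string,
  to: string,
  departAfter: Date,
): Promise<Date | null> {
  const arrival = new Map<string, number>([[from, departAfter.getTime()]]);
  let url: string | undefined = firstPageUrl;
  while (url) {
    const res = await fetch(url);
    const page = (await res.json()) as Page;
    for (const c of page.connections) {
      const dep = Date.parse(c.departureTime);
      // Connections are ordered by departure time; once departures start
      // later than the best known arrival at the target, we can stop.
      if (dep > (arrival.get(to) ?? Infinity)) {
        return new Date(arrival.get(to)!);
      }
      // Reachable if we can be at the departure stop by departure time.
      if ((arrival.get(c.departureStop) ?? Infinity) <= dep) {
        const arr = Date.parse(c.arrivalTime);
        if (arr < (arrival.get(c.arrivalStop) ?? Infinity)) {
          arrival.set(c.arrivalStop, arr);
        }
      }
    }
    url = page.nextPage;
  }
  return arrival.has(to) ? new Date(arrival.get(to)!) : null;
}
```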

    Client-side route planning: preprocessing the OpenStreetMap road network for Routable Tiles

    Delva et al. (2019). Client-side route planning: preprocessing the OpenStreetMap road network for Routable Tiles. In: Minghini, M., Grinberger, A.Y., Juhász, L., Yeboah, G., Mooney, P. (Eds.), Proceedings of the Academic Track at the State of the Map 2019, pp. 23-24. Heidelberg, Germany, September 21-23, 2019. Available at https://zenodo.org/communities/sotm-2019. DOI: 10.5281/zenodo.338770

    Geospatially partitioning public transit networks for open data publishing

    Public transit operators often publish their open data as a data dump, but developers with limited computational resources may not have the means to process all this data efficiently. In our prior work, we have shown that geospatially partitioning an operator's network can improve query times for client-side route planning applications by a factor of 2.4. However, it remains unclear whether this works for all network types, or for other kinds of applications. To answer these questions, we must evaluate the same method on more networks and analyze the effect of geospatial partitioning on each network separately. In this paper we process three networks in Belgium: (i) the national railways, (ii) the regional operator in Flanders, and (iii) the network of the city of Brussels, using both real and artificially generated query sets. Our findings show that on the regional network we can make query processing 4 times more efficient, but we could not improve performance on the city network by more than 12%. Both the network's topography and, to a lesser extent, how users interact with the network determine how suitable the network is for partitioning. Thus, we arrive at a negative answer to our question: our method does not work equally well for all networks. Moreover, since the network's topography is the main determining factor, we expect this finding to apply to other graph-based geospatial data, as well as to other Link Traversal-based applications.
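    As a simplified illustration of geospatial partitioning (the paper's method is more involved), stops can be bucketed into a regular latitude/longitude grid, turning one data dump into geographically coherent fragments:

```typescript
// Toy geospatial partitioner: assign each stop to a grid cell keyed by
// its floored lat/lon indices. Cell size is an assumed parameter.
interface Stop { id: string; lat: number; lon: number }

function partition(stops: Stop[], cellDeg = 0.1): Map<string, Stop[]> {
  const cells = new Map<string, Stop[]>();
  for (const s of stops) {
    const key = `${Math.floor(s.lat / cellDeg)}_${Math.floor(s.lon / cellDeg)}`;
    const cell = cells.get(key) ?? [];
    cell.push(s);
    cells.set(key, cell);
  }
  return cells; // each cell becomes one independently publishable fragment
}
```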

    Publishing base registries as linked data event streams

    To foster interoperability, Public Sector Bodies (PSBs) maintain datasets that should become queryable as an integrated Knowledge Graph (KG). While some PSBs allow querying part of the KG on their servers, others favor publishing data dumps, allowing the querying to happen on third-party servers. As a PSB's budget for publishing its dataset on the Web is finite, PSBs need guidance on which interface to offer first. A core API can be designed that covers the core tasks of Base Registries, a well-defined term in Flanders for the management of authoritative datasets. This core API should be the basis on which an ecosystem of data services can be built. In this paper, we introduce the concept of a Linked Data Event Stream (LDES) for datasets such as air quality sensors and observations, or a registry of officially registered addresses. We show that extra ecosystem requirements can be built on top of the LDES using a generic fragmenter. By using hypermedia to describe the LDES as well as the derived datasets, agents can dynamically discover their best way through the KG, and server administrators can dynamically add or remove functionality based on costs and needs. This way, we allow PSBs to prioritize API functionality based on three tiers: (i) the LDES, (ii) intermediary indexes, and (iii) querying interfaces. While the ecosystem will never be feature-complete, PSBs as well as market players can fill in gaps as requirements evolve, based on market needs.
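    A minimal sketch of an LDES consumer: the client replays the stream by emitting every member and following hypermedia links to further fragments. The page shape is simplified; real LDES pages are JSON-LD using the TREE vocabulary, whose tree:relation links the `relations` field stands in for here.

```typescript
// Simplified LDES page: members plus links to further fragments.
interface LdesPage { members: unknown[]; relations: string[] }

// Replay the whole stream, visiting each fragment exactly once.
async function replay(rootUrl: string, onMember: (m: unknown) => void): Promise<void> {
  const visited = new Set<string>();
  const queue = [rootUrl];
  while (queue.length > 0) {
    const url = queue.shift()!;
    if (visited.has(url)) continue; // guard against cyclic relations
    visited.add(url);
    const page = (await (await fetch(url)).json()) as LdesPage;
    page.members.forEach(onMember);  // emit every member on this page
    queue.push(...page.relations);   // discover further fragments via hypermedia
  }
}
```

    Because the fragments are discovered through hypermedia rather than a fixed URL scheme, a server administrator can add or remove derived fragmentations without breaking clients, which is the flexibility the three-tier prioritization relies on.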